Marginalizing shocks: A step back to stay on track

Cover those shocks which I ought not to see.

1 Model reparameterization

We marginalize the shocks in the mixture model! Going from:

# Non-shocky
target += normal_lpdf(y[n] | mu[n] + delta_sck[n], sigma)
target += normal_lpdf(delta_sck[n] | 0, inner_tau_sck)

# Shocky
target += normal_lpdf(y[n] | mu[n] + delta_sck[n], sigma)
target += normal_lpdf(delta_sck[n] | 0, outer_tau_sck)

to a log-likelihood where shocks are marginalized out:

# Non-shocky
target += normal_lpdf(y[n] | mu[n], eta)

# Shocky
target += normal_lpdf(y[n] | mu[n], eta * sqrt(1 + nu / eta^2))

where \eta = \sqrt{\sigma^2 + {\tau_\text{inner}^\text{shock}}^2} and \nu = {\tau_\text{outer}^\text{shock}}^2 - {\tau_\text{inner}^\text{shock}}^2.
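As a quick sanity check on this reparameterization (a minimal Python sketch; the parameter values are illustrative, not fitted quantities), the shocky scale \eta \sqrt{1 + \nu / \eta^2} recovers the direct marginalization \sqrt{\sigma^2 + {\tau_\text{outer}^\text{shock}}^2}:

```python
import math

# Illustrative values (not fitted quantities)
sigma, tau_inner, tau_outer = 0.3, 0.1, 1.5

eta = math.sqrt(sigma**2 + tau_inner**2)
nu = tau_outer**2 - tau_inner**2

# Marginalized shocky sd, as written in the Stan code
sd_shocky = eta * math.sqrt(1 + nu / eta**2)

# Direct marginalization of the shock: sqrt(sigma^2 + tau_outer^2)
assert math.isclose(sd_shocky, math.sqrt(sigma**2 + tau_outer**2))
```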

writeLines(
  readLines(file.path(wd, 'model', 'stan', 
                      'model9_marginalshocks.stan'))[230:315])
model {
  
  alpha ~ normal(0, 0.69); // 0.2 mm <~ exp(alpha) * 1 mm <~ 5 mm
  
  beta_gdd ~ normal(0, log(1.8) / 2.57); // -log(1.8) <~ beta_gdd <~ log(1.8)
  beta_sm ~ normal(0, log(1.8) / 2.57); // -log(1.8) <~ beta_sm <~ log(1.8)
  beta_vpd ~ normal(0, log(1.8) / 2.57); // -log(1.8) <~ beta_vpd <~ log(1.8)
  
  rho_sp ~ lognormal(2.65, 0.135); // 10 <~ rho <~ 20
  
  gamma_sp ~ normal(0, log(10) / 2.57); // 0 <~ gamma <~ log(10)
  
  for (s in 1:N_stands)
    f_tilde_sh[s] ~ normal(0, 1);
  rho_sh ~ lognormal(1.7, 0.26);       // 3 <~ rho_sh <~ 10
  gamma_sh ~ normal(0, log(3) / 2.57); // 0 <~ gamma_sh <~ log(3)
  
  eta ~ gamma(6, 43.5);
  nu ~ gamma(0.5, 0.033); 
  phi_sck ~ beta(2, 20); 
  omega_conc_sck ~ beta(230, 14); // 0.9 <~ omega_conc_sck <~ 0.97 (most trees, but not ALL trees)
  omega_nonconc_sck ~ beta(1, 20); // 0 <~ omega_nonconc_sck <~ 0.15 (should be rare, but... who knows?)
  
  vector[N] mu;
  for (t in 1:N_trees) {
    array[N_years[t]] int tree_idxs
    = linspaced_int_array(N_years[t],
                          tree_start_idxs[t],
                          tree_end_idxs[t]);

    array[N_years[t]] int all_years_idxs_tree = all_years_idxs[tree_idxs];
    array[N_years[t]] real years_tree = years[tree_idxs];

    int stand_idx = stand_idxs[t];
    int species_idx = species_idxs[t];

    vector[N_years[t]] gdd_obs_tree = gdd_obs[tree_idxs];
    vector[N_years[t]] sm_obs_tree = sm_obs[tree_idxs];
    vector[N_years[t]] vpd_obs_tree = vpd_obs[tree_idxs];

    mu[tree_idxs] =  alpha
    + beta_gdd[species_idx] * (gdd_obs_tree - gdd0)
    + beta_sm[species_idx] * (sm_obs_tree - sm0)
    + beta_vpd[species_idx] * (vpd_obs_tree - vpd0)
    + f_sh[stand_idx, all_years_idxs_tree];
    
    f_tilde[tree_idxs] ~ normal(0,1);
  }
  
  // Observational model with marginalized shocks
  for (s in 1:N_stands) {
    
    array[N_stand_trees[s]] int stand_trees_idxs = linspaced_int_array(N_stand_trees[s], 
      stand_trees_start_idxs[s], stand_trees_end_idxs[s]);
    vector[N_stand_years[s]] log_p0 = rep_vector(0, N_stand_years[s]);
    vector[N_stand_years[s]] log_p1 = rep_vector(0, N_stand_years[s]);
    
    for(ts in 1:N_stand_trees[s]){
      int t = stand_trees_idxs[ts];
      array[N_years[t]] int tree_idxs = linspaced_int_array(N_years[t], tree_start_idxs[t], tree_end_idxs[t]);
      array[N_years[t]] int all_years_idxs_tree = all_years_idxs[tree_idxs];
      
      for(y in 1:N_years[t]) {
        int ys = all_years_idxs_tree[y]-stand_start_years_idxs[s]+1;
        
        log_p0[ys] += log_mix(omega_nonconc_sck,
                          normal_lpdf(log_rw_obs[tree_idxs[y]] | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], eta * sqrt(1 + nu / eta^2)),
                          normal_lpdf(log_rw_obs[tree_idxs[y]] | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], eta));
        log_p1[ys] += log_mix(omega_conc_sck,
                          normal_lpdf(log_rw_obs[tree_idxs[y]] | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], eta * sqrt(1 + nu / eta^2)),
                          normal_lpdf(log_rw_obs[tree_idxs[y]] | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], eta));
        
      }
    }
    
    for(y in 1:N_stand_years[s]) {
      target += log_mix(phi_sck[s], log_p1[y], log_p0[y]);
    }
  }
  
}
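For intuition, the log_mix calls in the observational model above compute \log(\theta e^{l_1} + (1 - \theta) e^{l_2}) stably on the log scale; a minimal Python equivalent (a hypothetical stdlib-only helper mirroring Stan's log_mix, not the actual implementation):

```python
import math

def log_mix(theta, lp1, lp2):
    """Stable log of theta*exp(lp1) + (1-theta)*exp(lp2),
    mirroring Stan's log_mix via the log-sum-exp trick."""
    a = math.log(theta) + lp1
    b = math.log1p(-theta) + lp2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# An even mixture of two equal log densities is just that log density
assert math.isclose(log_mix(0.5, -1.0, -1.0), -1.0)
```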

2 Recover the shocks afterwards

Our large shocks are likely tied to interesting ecological processes… But by marginalizing them out, we no longer have inferences on them.
It’s easy to recover the shock state, but what about the shock amplitude?
If I assume that {\tau_\text{inner}^\text{shock}} = 0, I can reconstruct the large shocks using normal-normal conjugacy: \delta_\text{shock}(k) ~|~ y_k \sim \text{normal}\left( \frac{{\tau_\text{outer}^\text{shock}}^2}{{\tau_\text{outer}^\text{shock}}^2 + \sigma^2} \, y_k ~,~ \sqrt{\frac{{\tau_\text{outer}^\text{shock}}^2 \sigma^2}{{\tau_\text{outer}^\text{shock}}^2 + \sigma^2}} \right)
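This is the standard normal-normal update; a small numerical check (Python, with illustrative values for \tau_\text{outer}^\text{shock}, \sigma, and the residual y_k) confirms that the mean and sd formulas match the usual precision-weighted form:

```python
import math

# Illustrative values; y_k plays the role of the residual log ring width
tau_outer, sigma, y_k = 1.5, 0.3, -2.0

# Formulas from the text
post_mean = tau_outer**2 / (tau_outer**2 + sigma**2) * y_k
post_sd = math.sqrt(tau_outer**2 * sigma**2 / (tau_outer**2 + sigma**2))

# Equivalent precision-weighted form of the normal-normal update
prec = 1 / tau_outer**2 + 1 / sigma**2
assert math.isclose(post_mean, (y_k / sigma**2) / prec)
assert math.isclose(post_sd, math.sqrt(1 / prec))
```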

3 Time to deal with those zeros!

Before digging deeper into the shocks, I should first deal with the missing rings.
We have the domain expertise to say that zeros are a signature of extreme tree stress and very low growth. These missing rings correspond to observations below the measurement precision, which should be around 1\mathrm{e}{-3} ~\text{mm}! We should thus treat the probability of a missing ring as the probability of observing a value between -\infty and some threshold \epsilon (which can be fixed, or a parameter to infer). The pseudo-code would be:

if(rw_obs[tree_idxs[y]] >= epsilon){
  target += normal_lpdf(log(rw_obs[tree_idxs[y]]) | mu[n], sqrt(outer_tau2_aux+sigma^2))
}else{
  target += normal_lcdf(log(epsilon) | mu[n], sqrt(outer_tau2_aux + sigma^2))
}
Warning

Here I used the condition rw_obs[tree_idxs[y]] >= epsilon, because epsilon is the “last” observed value before zero. I do have a minor concern, though: log(epsilon) is also included in the normal_lcdf integral… Is it a big issue?
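It should not be: under a continuous likelihood, the single point \log(\epsilon) carries zero probability mass, so counting it in both the lpdf branch and the lcdf integral changes nothing. A quick numerical illustration (Python, with illustrative values for the likelihood and the threshold):

```python
import math
from statistics import NormalDist

# Illustrative values: likelihood location/scale and the censoring threshold
mu, sd, eps = 0.0, 0.5, 1e-3
z = NormalDist(mu, sd)

# Mass that the lcdf branch assigns to the boundary point log(eps) itself
boundary_mass = z.cdf(math.log(eps)) - z.cdf(math.log(eps) - 1e-12)

# A single point has measure zero under a continuous density
assert 0.0 <= boundary_mass < 1e-12
```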

I modified the Stan code to include this censoring.

writeLines(
  readLines(file.path(wd, 'model', 'stan', 
                      'model9_marginalshocks_zerotauinner_censoring_onlyconcordant.stan'))[310:326])
        if(rw_obs[tree_idxs[y]] > epsilon){
          log_p0[ys] += normal_lpdf(log(rw_obs[tree_idxs[y]]) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sigma);
          log_p1[ys] += log_mix(omega_conc_sck,
                          normal_lpdf(log(rw_obs[tree_idxs[y]]) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sqrt(outer_tau_sck^2 + sigma^2)),
                          normal_lpdf(log(rw_obs[tree_idxs[y]]) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sigma));
        }else{
          log_p0[ys] += normal_lcdf(log(epsilon) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sigma);
          log_p1[ys] += log_mix(omega_conc_sck,
                          normal_lcdf(log(epsilon) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sqrt(outer_tau_sck^2 + sigma^2)),
                          normal_lcdf(log(epsilon) | mu[tree_idxs[y]] 
                          + f[tree_idxs[y]], sigma));
        }

But I am less sure on how to handle this in the generated quantities block.
For now, when the observation y is below the threshold \epsilon, I sample \log(y) from a truncated normal distribution (between -\infty and \log(\epsilon)).

writeLines(
  readLines(file.path(wd, 'model', 'stan', 
                      'model9_marginalshocks_zerotauinner_censoring_onlyconcordant.stan'))[394:411])
      if(sck_state[tree_idxs[y]] == 0){
        log_rw_pred[tree_idxs[y]] = normal_rng(mu[y] + f[tree_idxs[y]], sigma);
      }else if(rw_obs[tree_idxs[y]] > epsilon){
        // we can reconstruct the shock posterior using normal-normal conjugacy
        real residual = log(rw_obs[tree_idxs[y]]) - mu[y] - f[tree_idxs[y]];
        real conjugate_mean = (outer_tau_sck^2 / (outer_tau_sck^2 + sigma^2)) * residual;
        real conjugate_sd   = sqrt((outer_tau_sck^2 * sigma^2) / (outer_tau_sck^2 + sigma^2));
        delta_sck[tree_idxs[y]] = normal_rng(conjugate_mean, conjugate_sd);
        log_rw_pred[tree_idxs[y]] = normal_rng(mu[y] + f[tree_idxs[y]] + delta_sck[tree_idxs[y]], sigma);
      }else{
        // sample from a truncated normal distribution? between -inf and log(epsilon)
        real log_y = normal_ub_rng(mu[y] + f[tree_idxs[y]], sqrt(outer_tau_sck^2 + sigma^2), log(epsilon));
        real residual = log_y - mu[y] - f[tree_idxs[y]];
        real conjugate_mean = (outer_tau_sck^2 / (outer_tau_sck^2 + sigma^2)) * residual;
        real conjugate_sd   = sqrt((outer_tau_sck^2 * sigma^2) / (outer_tau_sck^2 + sigma^2));
        delta_sck[tree_idxs[y]] = normal_rng(conjugate_mean, conjugate_sd);
        log_rw_pred[tree_idxs[y]] = normal_rng(mu[y] + f[tree_idxs[y]] + delta_sck[tree_idxs[y]], sigma);
      }

I defined the truncated normal sampling function with the help of the Stan user's guide.
I added a condition to deal with p_ub = 0, i.e. when the probability that a sample from normal(mu, sigma) falls below ub numerically underflows to zero. In this case, I consider that the entire density collapses to a single point, ub (i.e. \log(\epsilon)).

writeLines(
  readLines(file.path(wd, 'model', 'stan', 
                      'model9_marginalshocks_zerotauinner_censoring_onlyconcordant.stan'))[57:67])
  // see truncated-random-number-generation on Stan's user guide
  real normal_ub_rng(real mu, real sigma, real ub) {
    real p_ub = normal_cdf(ub, mu, sigma);
    // deal with the limit case of the truncated normal
    // to avoid error: "Exception: uniform_rng: Upper bound parameter is 0"
    if (p_ub == 0)
      return ub;
    real u = uniform_rng(0, p_ub);
    real y = mu + sigma * inv_Phi(u);
    return y;
  }
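A Python mirror of this sampler (a hypothetical port using statistics.NormalDist; same inverse-CDF construction and the same p_ub = 0 guard as the Stan helper):

```python
import random
from statistics import NormalDist

def normal_ub_rng(mu, sigma, ub, rng=random):
    """Sample normal(mu, sigma) truncated above at ub via inverse CDF;
    collapses to ub when P(X < ub) numerically underflows to zero."""
    z = NormalDist()
    p_ub = z.cdf((ub - mu) / sigma)
    if p_ub == 0.0:  # entire density is numerically above ub
        return ub
    u = rng.uniform(0.0, p_ub)
    return mu + sigma * z.inv_cdf(u)

rng = random.Random(42)
draws = [normal_ub_rng(0.0, 1.0, -1.0, rng) for _ in range(1000)]
assert all(d <= -1.0 for d in draws)
```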

Currently I just set real epsilon = 1e-3;. But we could infer this parameter later.

4 A first fit on two stands

Here, we consider that:

  • we have a normal population model for the shocks, with {\tau_\text{inner}^\text{shock}} = 0 (see above)
  • \phi_\text{shock}, the stand-level shock probability, varies across stands (heterogeneous)
  • \omega_\text{shock}^\text{concordant}, the tree-level shock probability given the stand is shocking, does not vary across stands (homogeneous)
  • \omega_\text{shock}^\text{non-concordant} = 0 (no tree-level shock without a stand-level shock)
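The assumed two-level shock structure above can be sketched as a tiny generative simulator (a Python sketch; the function name and the phi/omega values are illustrative, not fitted):

```python
import random

def simulate_shock_states(n_years, n_trees, phi, omega_conc, rng):
    """Simulate the two-level shock process assumed above:
    a stand-year shocks with probability phi; given a stand shock,
    each tree shocks with probability omega_conc; with
    omega_nonconc = 0, no tree shock occurs outside a stand shock."""
    states = []
    for _ in range(n_years):
        stand_shock = rng.random() < phi
        tree_shocks = [stand_shock and rng.random() < omega_conc
                       for _ in range(n_trees)]
        states.append((stand_shock, tree_shocks))
    return states

rng = random.Random(1)
sim = simulate_shock_states(100, 10, phi=0.1, omega_conc=0.93, rng=rng)
# Tree-level shocks only ever occur in stand-shock years
assert all(stand or not any(trees) for stand, trees in sim)
```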
data <- readRDS(file = file.path(wd, 'output', 'model', 'data_15nov2025_az592_az628.rds'))
data$rw_obs <- ifelse(data$log_rw_obs == log(1e-4), 0, exp(data$log_rw_obs))

if(FALSE){
  fit <- stan(file=file.path(wd, 'model', 'stan', 'model9_marginalshocks_zerotauinner_censoring.stan'),
            data=data, seed=5838293, chains = 4,
            warmup=1000, iter=2024, refresh=10)
  saveRDS(fit, file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring.rds'))
}else{
  fit <- readRDS(file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring.rds'))
}

Woaa! A flood of treedepth warnings.

diagnostics <- util$extract_hmc_diagnostics(fit)
util$check_all_hmc_diagnostics(diagnostics)
  Chain 1: 1024 of 1024 transitions (100.00%)
           saturated the maximum treedepth of 10.

  Chain 2: 1024 of 1024 transitions (100.00%)
           saturated the maximum treedepth of 10.

  Chain 3: 1024 of 1024 transitions (100.00%)
           saturated the maximum treedepth of 10.

  Chain 4: 1024 of 1024 transitions (100.00%)
           saturated the maximum treedepth of 10.

  Numerical trajectories that saturate the maximum treedepth have
terminated prematurely.  Increasing max_depth above 10 should result in
more expensive, but more efficient, Hamiltonian transitions.
samples <- util$extract_expectand_vals(fit)
base_samples <- util$filter_expectands(samples,
                                       c('alpha', 
                                         'beta_gdd', 'beta_sm', 'beta_vpd',
                                         'rho_sh', 'gamma_sh',
                                         'rho_sp', 'gamma_sp',
                                         'sigma', 
                                         'phi_sck', 'omega_conc_sck'),
                                       check_arrays = TRUE)
util$check_all_expectand_diagnostics(base_samples)
rho_sp[1]:
  Chain 1: hat{ESS} (71.699) is smaller than desired (100).
  Chain 2: hat{ESS} (72.291) is smaller than desired (100).
  Chain 3: hat{ESS} (93.643) is smaller than desired (100).
  Chain 4: hat{ESS} (80.025) is smaller than desired (100).


Small empirical effective sample sizes result in imprecise Markov chain
Monte Carlo estimators.

Let’s do some retrodictive checks! Let’s look at 4 trees in the first stand:

The model seems to over-regularize the large negative shocks. And for the first tree, it compensates for the stand-level GP shock in 2022 with a positive shock. We obtain a \tau_\text{outer}^\text{shock} that is rather small compared to what our broad prior would suggest:

5 Considering a heterogeneous \omega_\text{shock}^\text{concordant}

We add some flexibility to our model! Now, we consider that:

  • \phi_\text{shock} and \omega_\text{shock}^\text{concordant} vary across stands (heterogeneous)

5.1 On 2 stands

if(FALSE){
  fit2 <- stan(file=file.path(wd, 'model', 'stan', 'model9_marginalshocks_zerotauinner_censoring_onlyconcordant_heterogeneous.stan'),
              data=data, seed=5838293, chains = 4,
              warmup=1000, iter=2024, refresh=10)
  saveRDS(fit2, file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring_heterogeneous.rds'))
}else{
  fit2 <- readRDS(file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring_heterogeneous.rds'))
}

Still a good amount of warnings, but only for one chain now.

diagnostics <- util$extract_hmc_diagnostics(fit2)
util$check_all_hmc_diagnostics(diagnostics)
  Chain 2: 1024 of 1024 transitions (100.00%)
           saturated the maximum treedepth of 10.

  Numerical trajectories that saturate the maximum treedepth have
terminated prematurely.  Increasing max_depth above 10 should result in
more expensive, but more efficient, Hamiltonian transitions.

However the parameter-level warnings are more concerning.

samples2 <- util$extract_expectand_vals(fit2)
base_samples <- util$filter_expectands(samples2,
                                       c('alpha', 
                                         'beta_gdd', 'beta_sm', 'beta_vpd',
                                         'rho_sh', 'gamma_sh',
                                         'rho_sp', 'gamma_sp',
                                         'sigma', 
                                         'phi_sck', 'omega_conc_sck'),
                                       check_arrays = TRUE)
util$check_all_expectand_diagnostics(base_samples)
beta_gdd[1]:
  Split hat{R} (1.191) exceeds 1.1.

beta_sm[1]:
  Split hat{R} (1.445) exceeds 1.1.

gamma_sh:
  Split hat{R} (3.616) exceeds 1.1.

rho_sp[1]:
  Split hat{R} (1.313) exceeds 1.1.
  Chain 1: hat{ESS} (92.496) is smaller than desired (100).
  Chain 2: hat{ESS} (89.945) is smaller than desired (100).

sigma:
  Split hat{R} (5.155) exceeds 1.1.

phi_sck[1]:
  Split hat{R} (1.314) exceeds 1.1.

phi_sck[2]:
  Split hat{R} (1.849) exceeds 1.1.

omega_conc_sck[1]:
  Split hat{R} (1.218) exceeds 1.1.

omega_conc_sck[2]:
  Split hat{R} (1.522) exceeds 1.1.


Split Rhat larger than 1.1 suggests that at least one of the Markov
chains has not reached an equilibrium.

Small empirical effective sample sizes result in imprecise Markov chain
Monte Carlo estimators.

We can look at the pair plot between \sigma and \gamma_\text{short}:

Looking at the two stand GPs reveals the two modes for \gamma_\text{short}… at least on stand 1.

If we look at the pair plot between \phi_\text{shock} and \omega_\text{shock}^\text{concordant}, for stand 1:

And stand 2:

However, the retrodictive check on the trees of stand 1 seems to indicate that we are on a good path!

On stand 2, almost no years are identified as shocks.

par(mfcol = c(4,2))
for(t in 41:44){
  idxs_t <- data$tree_start_idxs[t]:data$tree_end_idxs[t]
  names <- sapply(idxs_t,
                  function(x) paste0('log_rw_pred[', x, ']'))
  util$plot_conn_pushforward_quantiles(samples2, names, data$years[idxs_t],
                                       xlab="", ylab='Predictions (log)', main = paste0('Tree ', data$uniq_tree_ids[t]),
                                       display_ylim=c(-8, 3), display_xlim = c(1980, 2022))
  points(data$years[idxs_t], data$log_rw_obs[idxs_t], pch=16, cex=1, col="white")
  points(data$years[idxs_t], data$log_rw_obs[idxs_t], pch=16, cex=0.6, col="black")
  abline(v =  c(2002,2006), lty = 3, col="darkgrey")
  
  names <- sapply(idxs_t,
                  function(x) paste0('sck_state[', x, ']'))
  ticks <- c(unlist(lapply(seq(1980, 2010,10), function(x) c(x, rep(NA, 9)))), '2020',NA,NA)
  util$plot_disc_pushforward_quantiles(samples2, names, data$years[idxs_t],
                                       xlab="", ylab='Shock state', 
                                       display_ylim=c(-0.1, 1.1), 
                                       xticklabs =  ticks)
  abline(v =  which(data$all_years %in% c(2002, 2006)), lty = 3, col="darkgrey")
  
  names <- sapply(idxs_t,
                  function(x) paste0('delta_sck[', x, ']'))
  util$plot_conn_pushforward_quantiles(samples2, names, data$years[idxs_t],
                                       xlab="", ylab=expression(delta[shock]), 
                                       display_ylim=c(-8, 3), display_xlim = c(1980, 2022))
  abline(v =  c(2002,2006), lty = 3, col="darkgrey")
  
  names <- sapply(1:data$N_all_years,
                  function(x) paste0('f_sh[', data$stand_idxs[t],',',x, ']'))
  util$plot_conn_pushforward_quantiles(samples2, names, data$all_years,
                                       xlab="", ylab='Stand GP', 
                                       display_ylim=c(-8, 3), display_xlim = c(1980, 2022))
  abline(v = c(2002,2006), lty = 3, col="darkgrey")
}

5.2 On 20 stands

We now use more data. This model took 10 hours to run on fewer than 1,000 trees.

data <- readRDS("/home/victor/projects/climategrowthshifts/analysis/pnwandmore/output/model/data_15nov2025_azPIPO.rds")
data$rw_obs <- ifelse(data$log_rw_obs == log(1e-4), 0, exp(data$log_rw_obs))

if(FALSE){
  fit3 <- stan(file=file.path(wd, 'model', 'stan', 'model9_marginalshocks_zerotauinner_censoring_onlyconcordant_heterogeneous.stan'),
              data=data, seed=5838293, chains = 4,
              warmup=1000, iter=2024, refresh=10)
  saveRDS(fit3, file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring_heterogeneous_morestands.rds'))
}else{
  fit3 <- readRDS(file.path(wd, 'model/shocks/output/marginal', 'fit_normal_censoring_heterogeneous_morestands.rds'))
}

When considering more data, the HMC diagnostics come back clean.

diagnostics <- util$extract_hmc_diagnostics(fit3)
util$check_all_hmc_diagnostics(diagnostics)
  All Hamiltonian Monte Carlo diagnostics are consistent with reliable
Markov chain Monte Carlo.
samples3 <- util$extract_expectand_vals(fit3)
base_samples <- util$filter_expectands(samples3,
                                       c('alpha', 
                                         'beta_gdd', 'beta_sm', 'beta_vpd',
                                         'rho_sh', 'gamma_sh',
                                         'rho_sp', 'gamma_sp',
                                         'sigma', 
                                         'phi_sck', 'omega_conc_sck'),
                                       check_arrays = TRUE)
util$check_all_expectand_diagnostics(base_samples)
rho_sp[1]:
  Chain 1: hat{ESS} (98.880) is smaller than desired (100).
  Chain 2: hat{ESS} (42.658) is smaller than desired (100).
  Chain 4: hat{ESS} (87.057) is smaller than desired (100).


Small empirical effective sample sizes result in imprecise Markov chain
Monte Carlo estimators.

Let’s look at the pair plot between \sigma and \gamma_\text{short} again!

Let’s look at the retrodictive check for the same stand (1) as before:

And stand 2:

We can then look at the shocks across the 20 stands! That’s cool.

6 Species differences in shockiness

6.1 A model with every Pinus ponderosa

data <- readRDS("/home/victor/projects/climategrowthshifts/analysis/pnwandmore/output/model/data_03dec2025_PIPO.rds")
data$rw_obs <- ifelse(data$log_rw_obs == log(1e-4), 0, exp(data$log_rw_obs))

# Model fitted on GenOuest
if(TRUE){
  fit_pipo <- readRDS(file.path(wd, 'model/shocks/output', 'fit_03dec2025_PIPO.rds'))
  diagnostics_pipo <- util$extract_hmc_diagnostics(fit_pipo)
  saveRDS(diagnostics_pipo, file.path(wd, 'model/shocks/output', 'diag_03dec2025_PIPO.rds'))
}else{
  diagnostics_pipo <- readRDS(file.path(wd, 'model/shocks/output', 'diag_03dec2025_PIPO.rds'))
}
util$check_all_hmc_diagnostics(diagnostics_pipo)
  All Hamiltonian Monte Carlo diagnostics are consistent with reliable
Markov chain Monte Carlo.

6.1.1 Retrodictive check

Let’s look at 8 random trees!